WEIGHT DISTRIBUTION AND DECODING OF CODES ON HYPERGRAPHS 
Abstract. Codes on hypergraphs are an extension of the well-studied family of codes on bipartite
graphs. Bilu and Hoory (2004) constructed an explicit family of codes on regular $t$-partite
hypergraphs whose minimum distance improves earlier estimates of the distance of bipartite-graph codes.
They also suggested a decoding algorithm for such codes and estimated its error-correcting
capability.

In this paper we study two aspects of hypergraph codes. First, we compute the weight
enumerators of several ensembles of such codes, establishing conditions under which they attain the
Gilbert-Varshamov bound and deriving estimates of their distance. In particular, we show that this
bound is attained by codes constructed on a fixed bipartite graph with a large spectral gap.

We also suggest a new decoding algorithm of hypergraph codes that corrects a constant fraction 
of errors, improving upon the algorithm of Bilu and Hoory. 



I. Introduction 

Codes on graphs account for some of the best known code families in terms of their error cor- 
rection under low-complexity decoding algorithms. They are also known to achieve a very good 
tradeoff between the rate and relative distance. The most well-studied case is codes defined on a 
bipartite graph. In this construction, a code of length N = mn is obtained by "parallel concate- 
nation" of 2m codes of a small length n which refers to the fact that each bit of the codeword is 
checked by two independent length-n codes. The arrangement of parity checks is specified by the 
edges of a bipartite graph which are in one-to-one correspondence with the codeword bits. 

Codes on bipartite graphs are known to be asymptotically good, i.e., to have nonvanishing rate
$R$ and relative distance $\delta$ as the code length $N$ tends to infinity. Constructive families of
bipartite-graph codes with the best known tradeoff between $R$ and $\delta$ have been found by the present authors
in [1]. In particular, codes constructed in that paper surpass the product bound on the minimum
distance, which is a common performance benchmark for concatenated constructions.

Moving from constructive families to existence results obtained by averaging over ensembles 
of bipartite-graph codes, it is possible to derive even better rate-distance tradeoffs. In particular, 
bipartite-graph codes with random local codes and random bipartite graphs attain the
Gilbert-Varshamov (GV) bound for relatively small code rates and are only slightly below it for higher
rates [1].

A natural way to generalize codes on bipartite graphs is to consider concatenations governed
by regular $t$-partite hypergraphs, $t \ge 2$. This code family was studied by Bilu and Hoory in [2].
While constructive families of bipartite-graph codes rely on the expansion property of the
underlying graph, expansion is not well defined for hypergraphs. Instead, [2] put forward a property
of hypergraphs, called $\epsilon$-homogeneity, which replaces expansion in the analysis of hypergraph
codes. [2] showed that there exist explicit, easily constructible families of $\epsilon$-homogeneous
hypergraphs, and estimated the number of errors corrected by their codes under a decoding algorithm
suggested in that paper.

In this paper we study hypergraph codes both from the perspective of weight distributions and
their decoding. The results of [1] on weight distributions are advanced in several directions. In
Theorem 2 and its corollary we prove that the code ensemble defined by random regular $t$-partite
hypergraphs and random local linear codes contains codes that meet the GV bound. The region
of code rates for which this claim holds true extends as $t$ increases from the value $t = 2$. We also
show (Theorem 7, Corollary 8) that the ensemble of hypergraph codes contains codes that attain the GV
bound even if random hypergraphs are replaced with a fixed $\epsilon$-homogeneous hypergraph.
Specializing the last result for $t = 2$, we establish that expander codes of Sipser and Spielman [5]
constructed from a fixed graph with a large spectral gap and random local codes attain the GV bound
with high probability. Finally, we derive an estimate of the average weight distribution for
the ensemble of hypergraph codes with a fixed local code (see Theorem 5) that substantially refines
a corresponding result in [1] and generalizes it from $t = 2$ to arbitrary $t$.

The tradeoff between the rate and relative distance of hypergraph codes shows an improvement
over bipartite-graph codes for small values of the distance. On the other hand, the decoding
algorithm of [2] does not exploit the full power of their codes; moreover, for small $\delta$ the proportion of
errors corrected by it vanishes compared to the value of the distance. Motivated by this, we
propose a new decoding algorithm for hypergraph codes and estimate its error-correcting capability.
We show that it corrects a number of errors which constitutes a fixed proportion of the code's
distance.

I-A. Codes on bipartite graphs. Let $G = (V, E)$ be a balanced, $n$-regular bipartite graph with the
vertex set $V = V_1 \cup V_2$, $|V_1| = |V_2| = m$, and $|E| = N = nm$ edges. Let us choose an arbitrary
ordering of edges in $E$. For a given vertex $v \in V$ this defines an ordering of the edges $v(1), v(2), \dots, v(n)$
incident to it. We denote this subset of edges by $E(v)$. Given a binary vector $x \in \{0, 1\}^N$, let us
establish a one-to-one correspondence between the coordinates of $x$ and the edges in $E$. For a given
vertex $v$ let $x(v) = (x_e, e \in E(v))$ be the subvector that corresponds to the edges in $E(v)$. Denote
by $\lambda$ the second largest in absolute value eigenvalue of the graph $G$.

Consider a set of binary linear codes $A_v[n, R_1 n]$ of length $n$ and rate $R_1 = \dim(A_v)/n$, where
$v \in V$. Define a bipartite-graph code as follows:

$$C(G, \{A_v\}) = \{x \in \{0, 1\}^N : \forall_{v \in V}\; x(v) \in A_v\}.$$

The rate of the code $C$ is easily seen to satisfy

$$R(C) \ge 2R_1 - 1. \tag{1}$$

If we assume that all the local codes are the same, i.e., $A_v = A$, where $A[n, R_1 n, d_1 = \delta_1 n]$ is some
linear code, we will write $C(G, A)$ instead of $C(G, \{A\})$. In this case the distance $d$ of the code $C$
can be estimated in terms of $d_1$ and $\lambda$. In particular, if the spectral gap of $G$
is large, i.e., $\lambda$ is small compared to $d_1$, then the relative distance $d/N$ is close to the value $\delta_1^2$,
similarly to the case of the direct product code $C = A \otimes A$.
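As a small sanity check of this construction, the sketch below enumerates a bipartite-graph code by brute force. The graph (the complete bipartite graph $K_{3,3}$, so $m = n = 3$) and the local code (the $[3, 2, 2]$ single-parity-check code) are illustrative choices, not taken from the paper. With this graph the edges form a $3 \times 3$ array whose rows and columns are the local constraints, so $C(G, A)$ is exactly the direct product $A \otimes A$.

```python
from itertools import product

def in_parity_code(bits):
    """Local code A: the [3, 2, 2] single-parity-check code (illustrative choice)."""
    return sum(bits) % 2 == 0

def graph_code_k33():
    """Enumerate C(G, A) for G = K_{3,3}; edge (i, j) is coordinate 3*i + j."""
    n = 3
    codewords = []
    for x in product((0, 1), repeat=n * n):
        rows_ok = all(in_parity_code(x[n * i:n * i + n]) for i in range(n))  # vertices of V_1
        cols_ok = all(in_parity_code(x[j::n]) for j in range(n))             # vertices of V_2
        if rows_ok and cols_ok:
            codewords.append(x)
    return codewords

code = graph_code_k33()
dimension = len(code).bit_length() - 1              # log2 of the number of codewords
min_distance = min(sum(c) for c in code if any(c))  # minimum nonzero weight
print(dimension, min_distance)
```

Here the rate $4/9$ exceeds the lower bound $2R_1 - 1 = 1/3$, and the distance equals $d_1^2 = 4$, as expected for a direct product code.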

The weight distribution of bipartite-graph codes constructed from random regular bipartite
graphs and a fixed local code $A$ with a known weight distribution was analyzed in [3, 4]. In
particular, it was shown that if $A$ is the Hamming code $\mathcal{H}_m$ then the ensemble $\mathcal{C} = \{C(G, A)\}$ contains
asymptotically good codes. Generalizing these results, paper [1] studied the weight distribution of
bipartite-graph codes with fixed and random component codes $A$. It was shown that for $m \to \infty$
the ensemble of codes constructed from random regular bipartite graphs and a fixed code $A$ with
distance $d_1 \ge 3$ contains asymptotically good codes. It has also been shown [1] that if the local
codes are chosen randomly, then the code ensemble $\mathcal{C}$ contains codes that meet the GV bound in
the interval of code rates $R(C) \le 0.2$.

I-B. Codes on hypergraphs. Generalizing the above construction, let $H = (V, E)$ be a $t$-uniform
$t$-partite $n$-regular hypergraph. This means that the set of vertices $V = V_1 \cup \dots \cup V_t$ of $H$ consists
of $t$ disjoint parts of equal size, say, $|V_i| = m$, $1 \le i \le t$. Every hyperedge $(v_{i_1}, v_{i_2}, \dots, v_{i_t})$ contains
exactly $t$ vertices, one from each part, and each vertex is incident to $n$ hyperedges. Below for
brevity we say edges instead of hyperedges. The number of edges of $H$ equals $N = mn$, which
will also be the length of our hypergraph codes. As above, assume that the edges are ordered in
an arbitrary fixed way and denote by $E(v)$ the set of edges incident to a vertex $v$. For definiteness,
let us assume that the edges $e_{(i-1)n+j}$, $j = 1, \dots, n$, are incident to the vertex $v_i \in V_1$, $1 \le i \le m$.

Given a binary vector $x \in \{0, 1\}^N$ whose coordinates are in a one-to-one correspondence with
the edges of $H$, denote by $x(v)$ its subvector that corresponds to the edges in $E(v)$.

Define a hypergraph code as follows:

$$C(H, \{A_v\}) = \{x \in \{0, 1\}^N : \forall_{v \in V}\; x(v) \in A_v\},$$

where $\{A_v, v \in V\}$ is a set of binary linear codes of length $n$. As above, if all the codes are the
same, we write $C(H, A)$. Assume that all the codes $A_v$ have the same rate $R_1$; then the rate of the
code $C$ satisfies

$$R(C) \ge tR_1 - (t - 1). \tag{2}$$
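The bound (2) is simple arithmetic, and a few exact evaluations make it concrete. The sketch below uses a hypothetical helper name, `design_rate`, and evaluates the bound for the local codes that appear in the examples of Section II-B (Hamming codes of lengths 15 and 31 and a BCH code of rate 21/31).

```python
from fractions import Fraction

def design_rate(t, local_rate):
    """Lower bound (2) on the rate of a hypergraph code: t*R_1 - (t - 1)."""
    return t * Fraction(local_rate) - (t - 1)

# (t, R_1) pairs taken from the examples in Section II-B.
for t, r1 in [(3, Fraction(11, 15)), (3, Fraction(26, 31)),
              (4, Fraction(26, 31)), (3, Fraction(21, 31))]:
    print(t, r1, design_rate(t, r1))
```

Note that the bound becomes nonpositive as soon as $R_1 \le (t-1)/t$, so larger $t$ calls for local codes of higher rate.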

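Definition 1. [2] A hypergraph $H$ is called $\epsilon$-homogeneous if for every $t$ sets $D_1, D_2, \dots, D_t$ with $D_i \subset V_i$
and $|D_i| = \alpha_i m$,

$$\frac{|E(D_1, D_2, \dots, D_t)|}{N} \le \prod_{i=1}^{t} \alpha_i + \epsilon \min_{1 \le i < j \le t} \sqrt{\alpha_i \alpha_j}, \tag{3}$$

where $E(D_1, D_2, \dots, D_t)$ denotes the set of edges that intersect all the sets $D_i$.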

This definition quantifies the deviation of the hypergraph $H$ from the expected behavior of a
random hypergraph. For $t = 2$ the well-known "expander mixing lemma" asserts that

$$\left| \frac{|E(D_1, D_2)|}{N} - \alpha_1 \alpha_2 \right| \le \frac{\lambda}{n} \sqrt{\alpha_1 \alpha_2},$$

showing that regular bipartite graphs are $\lambda/n$-homogeneous. This inequality is frequently used
in the analysis of bipartite-graph codes [5, 6].

Let $A[n, R_1 n, d_1 = \delta_1 n]$ be a binary linear code. The distance of a code $C(H, A)$ where $H$ is
$\epsilon$-homogeneous satisfies [2]

$$d/N \ge \delta_1^{t/(t-1)} - c_1(\epsilon, \delta_1, t), \tag{4}$$

where $c_1 \to 0$ as $\epsilon \to 0$.

One of the main results in [2] gives an explicit construction of $\epsilon$-homogeneous hypergraphs
$H$ starting with a regular graph $G(U, E)$ with degree $\Delta$ and second eigenvalue $\lambda$. Putting $V_i = U$,
$i = 1, 2, \dots, t$, and introducing a hyperedge whenever the $t$ vertices in the graph $G$ are
connected by a path of length $t - 1$, that paper shows that the resulting hypergraph is $n$-regular and
$\epsilon$-homogeneous with $n = \Delta^{t-1}$, $\epsilon = 2(t - 1)\lambda/\Delta$. Therefore, starting with a family of $\Delta$-regular
bipartite graphs with a large spectral gap, one can construct a family of regular homogeneous
hypergraphs with a small value of $\epsilon$. Paper [2] has also established that random $n$-regular
hypergraphs are $O(1/\sqrt{n})$-homogeneous with high probability.
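The path construction can be illustrated on a toy graph. The sketch below is written under one assumption that is not stated in the text: a "path of length $t - 1$" is read as a walk, i.e., vertices may repeat, which is what makes the degree come out as exactly $\Delta^{t-1}$. The input graph ($K_4$, which is 3-regular) and the function name `path_hypergraph` are illustrative.

```python
from collections import Counter

def path_hypergraph(adj, t):
    """Hyperedges are walks (u_1, ..., u_t) of length t-1 in the graph `adj`;
    the i-th entry is interpreted as a vertex of part V_i (a copy of U)."""
    walks = [(u,) for u in adj]
    for _ in range(t - 1):
        walks = [w + (v,) for w in walks for v in adj[w[-1]]]
    return walks

# Assumed toy input: the complete graph K_4 (degree Delta = 3).
adj = {u: [v for v in range(4) if v != u] for u in range(4)}
t = 3
edges = path_hypergraph(adj, t)
# degrees[i][u] = number of hyperedges containing vertex u of part V_{i+1}
degrees = [Counter(e[i] for e in edges) for i in range(t)]
print(len(edges), [sorted(d.values()) for d in degrees])
```

Every vertex in every part ends up with degree $\Delta^{t-1} = 9$, and the number of hyperedges is $N = mn = 4 \cdot 9$.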
II. Weight distributions

Below we consider ensembles of random codes on graphs and hypergraphs. In some cases the
(hyper)graph will be selected randomly. In the case of bipartite graphs this is done as follows.
Connect the edges $e_{(i-1)n+j}$, $j = 1, \dots, n$, to the vertex $v_i \in V_1$, $i = 1, \dots, m$. Next choose a
permutation on the set $E$ with a uniform distribution and connect the remaining half-edges to the
vertices in $V_2$ using this permutation. Similarly, to construct an ensemble of random hypergraphs,
we choose $t - 1$ permutations independently with uniform distribution and use them to connect
the parts of $H$.
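The sampling procedure just described can be sketched in a few lines. The code below is an illustration under stated assumptions: edges and vertices are indexed from 0 rather than 1, the function name `random_hypergraph` is hypothetical, and Python's `random.Random.shuffle` stands in for a uniformly distributed permutation.

```python
import random
from collections import Counter

def random_hypergraph(m, n, t, rng):
    """Return N = m*n hyperedges; entry i of an edge is its vertex in part V_{i+1}."""
    N = m * n
    parts = [[e // n for e in range(N)]]       # edge e is attached to vertex e // n of V_1
    for _ in range(t - 1):
        perm = list(range(N))
        rng.shuffle(perm)                      # one independent permutation per remaining part
        parts.append([perm[e] // n for e in range(N)])
    return list(zip(*parts))

edges = random_hypergraph(m=5, n=3, t=4, rng=random.Random(0))
degrees = [Counter(e[i] for e in edges) for i in range(4)]
```

Because each permutation is a bijection on the $N$ half-edges, every vertex in every part receives exactly $n$ edges, so the result is $n$-regular by construction (repeated hyperedges are possible and are not filtered out here).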

Random linear codes are selected from the standard ensemble of length-$n$ codes defined by
$n(1 - R_1) \times n$ random binary matrices whose entries are chosen independently with a uniform
distribution.

We consider the following three ensembles of hypergraph codes. 

Ensemble $\mathcal{C}_1(t)$. A code $C(H, \{A_1, \dots, A_t\}) \in \mathcal{C}_1(t)$ is constructed by choosing a random
$t$-partite hypergraph $H$ and choosing random local linear codes $A_i$ of length $n$ independently for
each part $V_i \subset V$.

Ensemble $\mathcal{C}_2(t, A)$. A code $C(H, A) \in \mathcal{C}_2$ is constructed by choosing a random $t$-partite
hypergraph $H$ and using the same fixed local code $A[n, R_1 n, d_1]$ as a local code at every vertex.

Ensemble $\mathcal{C}_3(t, H)$. A code $C(H, \{A_v\})$ from this ensemble is formed by choosing a fixed,
nonrandom hypergraph $H$ and taking random local linear codes $A_v$ independently for each vertex
$v \in V$.

Our purpose is to compute ensemble-average asymptotic weight distributions for codes in these
ensembles and to estimate the average minimum distance assuming that $m \to \infty$ and $n$ is a
constant. The case $t = 2$ corresponds to ensembles of bipartite-graph codes, some of which were
studied in [1, 3, 4]. Below we will cover the remaining cases for the code ensembles $\mathcal{C}_i(t)$, $i = 1, 2, 3$, and
any $t \ge 2$. Below $B_w = B_w(C)$ denotes the number of codewords of weight $w$. Before proceeding,
we note that upper bounds on the ensemble-average weight distribution in many cases also give
a lower bound on the code's distance.

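Lemma 1. Suppose that for an ensemble of codes $\mathcal{C}$ of length $N$ there exists an $\omega_0 > 0$ such that

$$\lim_{N \to \infty} \sum_{0 < w \le \omega_0 N} \mathsf{E} B_w = 0.$$

Then for large $N$ the ensemble contains codes whose relative distance satisfies $d/N \ge \omega_0$.
The proof is almost obvious because

$$\Pr[d(C) \le \omega_0 N] \le \sum_{0 < w \le \omega_0 N} \Pr[B_w(C) \ge 1] \le \sum_{0 < w \le \omega_0 N} \mathsf{E} B_w(C).$$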



II-A. Ensemble $\mathcal{C}_1(t)$.

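Theorem 2. For $m \to \infty$ the average weight distribution over the ensemble of linear codes $\mathcal{C}_1(t)$ of length
$N = mn$ and rate $R$ satisfies $\mathsf{E} B_{\omega N} \le 2^{N(F + \gamma)}$, where

$$F = -\omega t \log_2\big(2^{(1-R)/t} - 1\big) - (t - 1) h(\omega) \quad \text{if } 0 \le \omega \le 1 - 2^{(R-1)/t}, \tag{5}$$

$$F = h(\omega) + R - 1 \quad \text{if } \omega \ge 1 - 2^{(R-1)/t}, \tag{6}$$

and

$$\gamma \le t n^{-1}(1 + \log_2 n) + (t/2N) \log_2(2N),$$
$$h(z) = -z \log_2 z - (1 - z) \log_2(1 - z).$$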


Proof: The proof is an extension of the corresponding result for $t = 2$ in [1]. Let $C_i$, $i = 1, \dots, t$, be
the set of vectors $x \in \{0, 1\}^N$ that satisfy the linear constraints of part $V_i$ of the hypergraph $H$, so
that $C(H, A) = \cap_i C_i$. Let $P_i = \Pr[x \in C_i]$. The events $x \in C_i$ for different $i$ are independent, and
therefore

$$\Pr[x \in C] = P_i^t$$

(for any $i = 1, \dots, t$). Let $B_w(C_i)$ be the random number of vectors of weight $w$ in the code $C_i$.
Then

$$\mathsf{E} B_w(C_i) = \binom{N}{w} \Pr[x \in C_i] = \binom{N}{w} P_i.$$

Let $X_{s,w}$ be the set of vectors of weight $w = \omega N$ whose nonzero coordinates are incident to some $s$
vertices $v_{i_1}, \dots, v_{i_s} \in V_1$, $s \ge w/n$. Let $w_j = w(x(v_{i_j}))$, $j = 1, \dots, s$, and let $\omega_j = w_j/n$. We have

$$|X_{s,w}| = \binom{m}{s} \sum_{w_1, \dots, w_s} \prod_{j=1}^{s} \binom{n}{w_j} \le \binom{m}{s} \sum_{w_1, \dots, w_s} 2^{n \sum_j h(\omega_j)}.$$

By convexity of the entropy function, the maximum of the last expression on $\omega_1, \dots, \omega_s$ under the
constraint $n \sum_j \omega_j = \omega N$ is attained for $\omega_j = \omega m/s$, $j = 1, \dots, s$. Since the sum contains no more
than $n^s$ terms, we obtain

$$|X_{s,w}| \le 2^{m h(x) + s \log n + s n h(\omega m/s)} \le 2^{N(x h(\omega/x) + \epsilon)},$$

where $x = s/m$ and $\epsilon = (1 + \log n)/n$. A vector $x \in X_{s,w}$ is contained in $C_1$ with probability
$2^{-sn(1 - R_1)}$, so that $\mathsf{E} B_w(C_1) \le \sum_s |X_{s,w}|\, 2^{-sn(1 - R_1)}$,
and the same expression is true for $\mathsf{E} B_w(C_i)$, $i = 2, \dots, t$. Therefore,

$$\mathsf{E} B_w(C) = \binom{N}{w} P_1^t = \binom{N}{w}^{-(t-1)} \big(\mathsf{E} B_w(C_1)\big)^t.$$

Since $t(R_1 - 1) \le R - 1$, we obtain $\mathsf{E} B_{\omega N}(C) \le 2^{N(F(\omega) + \gamma)}$, where

$$F(\omega) \le -(t - 1) h(\omega) + t \max_{\omega \le x \le 1} \big(x(R_1 - 1 + h(\omega/x))\big)$$
$$\le -(t - 1) h(\omega) + \max_{\omega \le x \le 1} \big(x(R - 1 + t h(\omega/x))\big).$$

The maximum on $x$ of $x(R - 1 + t h(\omega/x))$ is attained for $x = x_0 = \omega/(1 - z)$, where $t \log_2 z = R - 1$.
The two cases in the theorem are obtained depending on whether $x_0 < 1$ or not. If $x_0 < 1$,
we substitute $x_0$ in the expression for $F(\omega)$ and obtain

$$F(\omega) \le -(t - 1) h(\omega) + \omega t \log_2 \frac{z}{1 - z},$$

which implies (5) on account of the identity $R - 1 + t h(z) = t(1 - z) \log_2(z/(1 - z))$. If $x_0 \ge 1$, we
substitute the value $x = 1$ to obtain (6). $\square$
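The maximization step in the proof can be checked numerically. The sketch below compares the closed form $\omega t \log_2(z/(1-z)) - (t-1)h(\omega)$ for $x_0 < 1$, and $h(\omega) + R - 1$ otherwise, with a direct grid search over $x \in [\omega, 1]$. The function names and the grid resolution are arbitrary illustrative choices.

```python
import math

def h(p):
    """Binary entropy in bits; 0 outside the open interval (0, 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def F_closed(omega, R, t):
    """Closed-form exponent obtained at the end of the proof."""
    z = 2 ** ((R - 1) / t)
    if omega < 1 - z:                                   # x_0 = omega / (1 - z) < 1
        return omega * t * math.log2(z / (1 - z)) - (t - 1) * h(omega)
    return h(omega) + R - 1                             # x_0 >= 1: substitute x = 1

def F_numeric(omega, R, t, steps=20000):
    """Same exponent via a brute-force maximum over a grid on [omega, 1]."""
    grid = (omega + (1 - omega) * k / steps for k in range(steps + 1))
    best = max(x * (R - 1 + t * h(omega / x)) for x in grid)
    return best - (t - 1) * h(omega)

for omega, R, t in [(0.05, 0.5, 3), (0.1, 0.5, 3), (0.3, 0.2, 2), (0.2, 0.7, 4)]:
    print(omega, R, t, F_closed(omega, R, t), F_numeric(omega, R, t))
```

The two branches of the closed form agree at the breakpoint $\omega = 1 - 2^{(R-1)/t}$, which is a quick way to confirm the sign of the logarithmic term.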

Corollary 3. Let $\omega^*$ be the only nonzero root of the equation

$$\omega\big(R - 1 - t \log_2(1 - 2^{(R-1)/t})\big) = (t - 1) h(\omega).$$

Then the average relative distance over the ensemble $\mathcal{C}_1(t)$ behaves as

$$\delta(R) \ge \omega^* \quad \text{if } R \le \log_2\big(2(1 - \delta_{GV}(R))^t\big),$$

$$\delta(R) \ge \delta_{GV}(R) \quad \text{if } R > \log_2\big(2(1 - \delta_{GV}(R))^t\big),$$

where $\delta_{GV}(x) = h^{-1}(1 - x)$.
The proof is analogous to the proof of Corollary 4 in [1] and will be omitted.

For $t = 2$ we proved in [1] that the ensemble $\mathcal{C}_1$ contains codes that reach the GV bound if the code
rate satisfies $0 \le R \le 0.202$. This result forms a particular case of the above corollary. Increasing
$t$, we find that the ensemble contains codes that reach the GV bound for the values of the rate as
shown below:

    t =      3       4       10
    R ≤      0.507   0.737   0.998

Thus already for $t = 10$ almost all codes in the ensemble $\mathcal{C}_1$ attain the GV bound for all but very
high rates.
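The rate thresholds quoted above can be reproduced numerically from the condition in Corollary 3: the GV regime ends at the rate where $R = \log_2(2(1 - \delta_{GV}(R))^t)$. The following is a minimal sketch with hypothetical function names; the bisection bracket is an assumption that is adequate for the values of $t$ in the table but is not claimed to hold for arbitrary $t$.

```python
import math

def h(p):
    """Binary entropy in bits; 0 outside the open interval (0, 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def gv_distance(R):
    """delta_GV(R) = h^{-1}(1 - R), computed by bisection on [0, 1/2]."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = (lo + hi) / 2
        if h(mid) < 1 - R:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def gv_rate_threshold(t):
    """Largest rate with R >= log2(2 (1 - delta_GV(R))^t), found by bisection."""
    gap = lambda R: R - math.log2(2 * (1 - gv_distance(R)) ** t)
    lo, hi = 1e-6, 1 - 1e-4        # assumed bracket: gap(lo) > 0 > gap(hi)
    for _ in range(100):
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for t in (2, 3, 4):
    print(t, round(gv_rate_threshold(t), 3))
```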

II-B. Ensemble $\mathcal{C}_2(t, A)$. In this case the results depend on the amount of information available
for the local codes. Specifically, [1] shows that for $t = 2$ the ensemble contains asymptotically
good codes provided that the distance of the local code $A$ is at least 3. In the case when the weight
distribution of the code $A$ is known, a better estimate is known from [3, 4].

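Theorem 4. Let $A$ be a linear code of length $n$ with weight enumerator $a(x) = \sum_{i=0}^{n} a_i x^i$. Let $B_w$ be the
random number of codewords of weight $w$ of a code $C(H, A) \in \mathcal{C}_2(t, A)$. Then its average value over the
ensemble satisfies

$$\lim_{N \to \infty} \frac{1}{N} \log_2 \mathsf{E} B_{\omega N} \le -(t - 1) h(\omega) + \frac{t}{n} \big( \log_2 a(e^{s^*}) - s^* n \omega \log_2 e \big),$$

where $s^*$ is the root of $(\ln a(e^s))'_s = n\omega$.

This theorem enables us to estimate the asymptotics of the mean relative distance $\delta = \lim_{N \to \infty} \mathsf{E}\, d(C)/N$
for the ensemble $\mathcal{C}_2$. Let us consider several examples.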

1. Let $t = 3$ and let $A$ be the Hamming code of length $n = 15$ and rate $R_1 = 11/15$. Then the rate
$R(\mathcal{C}_2) \ge 0.2$ and the distance $\delta = 0.2307$. The relative GV distance for this rate is $\delta_{GV}(0.2) = 0.2430$.

2. Let $t = 3$ and let $A$ be the Hamming code of length $n = 31$. Then $R(\mathcal{C}_2) \ge 16/31$ and $\delta = 0.0798$.
Using the same code with $t = 4$ gives $R(\mathcal{C}_2) \ge 11/31$ and $\delta = 0.1607$ while $\delta_{GV}(11/31) = 0.1646$.

3. Let $t = 3$ and let $A$ be the 2-error-correcting primitive BCH code of length $n = 31$ and rate
$R_1 = 21/31$. Then the rate $R(\mathcal{C}_2) \ge 1/31$ and the value of $\delta$ is 0.3946608. The relative GV
distance for this rate is $\delta_{GV}(1/31) = 0.3946614$.
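The first example can be approximately reproduced with a short computation. The sketch below assumes that the bound of Theorem 4 has the standard Chernoff (saddle-point) form, $(t/n)\min_{s \le 0}[\log_2 a(e^s) - s n \omega \log_2 e] - (t-1)h(\omega)$, and takes the distance estimate to be the first $\omega$ at which this exponent changes sign. The weight enumerator of the $[15, 11, 3]$ Hamming code is the classical one; the function names and search intervals are illustrative.

```python
import math

def h(p):
    """Binary entropy in bits; 0 outside the open interval (0, 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def hamming15_enumerator(x):
    """Weight enumerator of the [15, 11, 3] Hamming code evaluated at x."""
    return ((1 + x) ** 15 + 15 * (1 + x) ** 7 * (1 - x) ** 8) / 16

def exponent(omega, t, n, a):
    """Assumed Chernoff-type bound on (1/N) log2 E B_{omega N}, minimized over s <= 0."""
    f = lambda s: math.log2(a(math.exp(s))) / n - omega * s * math.log2(math.e)
    lo, hi = -40.0, 0.0
    for _ in range(200):                      # ternary search: f is convex in s
        s1, s2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if f(s1) > f(s2):
            lo = s1
        else:
            hi = s2
    return t * f((lo + hi) / 2) - (t - 1) * h(omega)

def distance_estimate(t, n, a, lo=0.01, hi=0.4):
    """Bisection for the sign change of the exponent, assuming it lies in [lo, hi]."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if exponent(mid, t, n, a) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

delta = distance_estimate(3, 15, hamming15_enumerator)
print(round(delta, 4))
```

The result can be compared with the value $\delta = 0.2307$ quoted in the first example and with $\delta_{GV}(0.2) = 0.2430$.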

Let us turn to the case when only the minimum distance $d_1$ of the code $A$ is available. In [1]
we addressed the case $t = 2$, proving that as long as $d_1 \ge 3$, there exists an $\epsilon > 0$ such that the
ensemble-average relative distance $\delta > \epsilon$ as $m \to \infty$. In the next theorem this result is extended to
arbitrary $t \ge 2$. We also prove a related result which gives an upper bound on the average weight
spectrum and provides a way of estimating the value of $\omega_0$.

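Theorem 5. (a) Let $A$ be the local code of length $n$ and distance $d_1$ used to construct the ensemble $\mathcal{C}_2(t, A)$
of hypergraph codes. Let $x_0 = x_0(\omega)$ be the positive solution of the equation

$$\omega n + \sum_{i=d_1}^{n} \binom{n}{i} (\omega n - i) x^i = 0. \tag{7}$$

The ensemble-average weight distribution satisfies

$$\lim_{N \to \infty} \frac{1}{N} \log \mathsf{E} B_{\omega N} \le \frac{t}{n} \log \frac{1 + \sum_{i=d_1}^{n} \binom{n}{i} x_0^i}{x_0^{\omega n}} - (t - 1) h(\omega).$$

(b) The inequality $d_1 > t/(t - 1)$ gives a sufficient condition for the ensemble to contain asymptotically
good codes.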
Proof: In the proof we write $d$ instead of $d_1$ to refer to the distance of the code $A$.

(a) Let $H$ be a random hypergraph and $C(H, A)$ be the corresponding code. Recall that $C = \cap_i C_i$,
where $C_i$ is the set of vectors that satisfy the constraints of part $i$ of the graph. Let $U_i(w, d)$
be the set of vectors $x \in \{0, 1\}^N$ such that $w(x) = w$ and $w(x(v)) = 0$ or $w(x(v)) \ge d$ for all
$v \in V_i$. Since the number of such vectors is the same for all $i$, below we write $|U(w, d)|$ omitting
the subscript. Let us choose a vector $x \in \{0, 1\}^N$ randomly with a uniform distribution. Then

$$\Pr[x \in C_1 \mid w(x) = w] \le \frac{|U(w, d)|}{\binom{N}{w}}$$

and for $i \ge 2$,

$$\Pr[x \in C_i \mid w(x) = w, x \in C_1] = \Pr[x \in C_i \mid w(x) = w].$$

Then

$$\mathsf{E} B_w(C) = \binom{N}{w} \Pr[x \in C \mid w(x) = w] = \binom{N}{w} \big(\Pr[x \in C_1 \mid w(x) = w]\big)^t \le \frac{|U(w, d)|^t}{\binom{N}{w}^{t-1}}. \tag{8}$$

Given a vector $x$ denote by $m_\ell$ the number of vertices $v \in V_1$ such that $w(x(v)) = \ell$. Clearly,

$$|U(w, d)| = \sum \binom{m}{m_0, m_d, \dots, m_n} \prod_{\ell=d}^{n} \binom{n}{\ell}^{m_\ell},$$

where the sum is over all $(m_0, m_d, \dots, m_n)$ with $\sum_\ell m_\ell = m$ and $\sum_\ell \ell m_\ell = w$.
This sum contains no more than $(m + 1)^n = O(N^n)$ terms, so for $N \to \infty$ its exponent is
determined by the maximum term (which has exponential growth). We obtain

$$\frac{1}{N} \log |U(\omega N, d)|^t \le \frac{t}{n} \max \Big\{ h(\nu_0, \nu_d, \nu_{d+1}, \dots, \nu_n) + \sum_{\ell=d}^{n} \nu_\ell \log \binom{n}{\ell} \Big\} + o(1), \tag{9}$$

where $\nu_\ell = m_\ell/m$, $\ell = 0, d, d + 1, \dots, n$, and $h(x) = -\sum_i x_i \log x_i$. The objective function is concave,
so the point of extremum is found from the system of equations

$$\binom{n}{i} \Big(1 - \sum_{\ell=d}^{n} \nu_\ell\Big) = \nu_i \mu^{-i}, \quad i = d, d + 1, \dots, n,$$

$$\sum_{\ell=d}^{n} \ell \nu_\ell = \omega n.$$

Its solution is given by

$$\nu_i = \frac{\binom{n}{i} \mu^i}{1 + \sum_{\ell=d}^{n} \binom{n}{\ell} \mu^\ell}, \quad i = d, d + 1, \dots, n,$$

where $\mu$ is chosen so as to satisfy the last equation of the system. Evaluating $\nu_\ell$ and writing
$x$ instead of $\mu$, we observe that it should satisfy Eq. (7). This equation has a unique root $x_0 > 0$
because putting $x = p/(1 - p)$, we can write it as

$$\omega n \Big(1 + \frac{(1 - p)^n}{\Pr[X \ge d]}\Big) = \mathsf{E}[X \mid X \ge d],$$

where $X$ is a binomial $(p, 1 - p)$ random variable. As $p$ changes from 0 to 1, the left-hand side
of the last equation decreases monotonically from $+\infty$ to $\omega n$ while the right-hand side increases
monotonically from $d$ to $n$.






A. BARG, A. MAZUMDAR, AND G. ZEMOR 



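Finally, computing the entropy and simplifying, we obtain the estimate

$$\lim_{N \to \infty} \frac{1}{N} \log |U(\omega N, d)|^t \le \frac{t}{n} \log \frac{1 + \sum_{i=d}^{n} \binom{n}{i} x_0^i}{x_0^{\omega n}}.$$

(b) The proof of the second part is analogous to the case of $t = 2$ in [1]. Let $w$, $1 \le w \le N$, be the
weight and let $p = w/d$. We have

$$|U(w, d)| \le \sum_{i=w/n}^{p} \binom{m}{i} \binom{n}{d}^i \binom{in}{(p - i)d} \le p \binom{m}{p} \binom{n}{d}^p 2^{pn}.$$

Then

$$\mathsf{E} B_w(C) \le \Big( p \binom{m}{p} \binom{n}{d}^p 2^{pn} \Big)^t \binom{N}{w}^{-(t-1)}.$$

Using the estimates $(a/b)^b \le \binom{a}{b} \le (ea/b)^b$, we compute

$$\mathsf{E} B_w(C) \le p^t \Big(\frac{em}{p}\Big)^{pt} n^{dpt}\, 2^{pnt} \Big(\frac{w}{N}\Big)^{w(t-1)} = p^t (sm/w)^{(w/d)(t - d(t-1))},$$

where $s = [(ed2^n)^t n^d]^{1/(t - d(t-1))}$. Thus, for any $\omega$ satisfying $\omega < s/n$, the average number of vectors
of weight $\omega N$ tends to 0 as $m \to \infty$ as long as $d(t - 1) > t$. This proves that under this condition
the ensemble contains asymptotically good codes. $\square$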

Examples. Let $A$ be the $[7, 4, 3]$ Hamming code and let $t = 2$. Theorem 5(a) implies a lower
bound $\delta \ge 0.01024$ on the average relative distance for the ensemble $\mathcal{C}_2(2, A)$. This improves upon
previous results ([3, 4]; also Part (b) of this theorem) which assert only that the ensemble contains
asymptotically good codes. Of course, in this case we can use the entire weight distribution of
the code $A$ to find the estimate $\delta \ge 0.186$ from Theorem 4; however, in cases when the weight
distribution is difficult to find, the last theorem provides new information for the ensemble of
graph codes.

Similarly, for $A[23, 12, 7]$ from Theorem 5(a) we obtain the estimate $\delta \ge 0.0234$. Again, using the
entire weight distribution, it is possible to obtain a better estimate.
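The first of these estimates can be recomputed from Theorem 5(a) using only $n$, $d_1$ and $t$. The sketch below solves Eq. (7) for $x_0$ by bisection, evaluates the bound with base-2 logarithms (an assumption made to match the base-2 entropy $h$), and locates the weight at which the bound changes sign. Function names and the search brackets are illustrative.

```python
import math
from math import comb

def h(p):
    """Binary entropy in bits; 0 outside the open interval (0, 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def solve_eq7(omega, n, d):
    """Positive root x_0 of  omega*n + sum_{i=d}^n C(n,i) (omega*n - i) x^i = 0."""
    g = lambda x: omega * n + sum(comb(n, i) * (omega * n - i) * x ** i
                                  for i in range(d, n + 1))
    lo, hi = 0.0, 1.0
    while g(hi) > 0:                 # g(0) > 0 and g is eventually negative
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exponent(omega, t, n, d):
    """Right-hand side of the bound in Theorem 5(a), in bits."""
    x0 = solve_eq7(omega, n, d)
    Z = 1 + sum(comb(n, i) * x0 ** i for i in range(d, n + 1))
    return (t / n) * (math.log2(Z) - omega * n * math.log2(x0)) - (t - 1) * h(omega)

def distance_estimate(t, n, d, lo=1e-4, hi=0.1):
    """Bisection for the sign change, assuming exponent(lo) < 0 < exponent(hi)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if exponent(mid, t, n, d) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

delta_hamming = distance_estimate(t=2, n=7, d=3)
print(round(delta_hamming, 5))
```

The same function can be applied to other parameters, for instance $(t, n, d) = (2, 23, 7)$, provided the bracket `[lo, hi]` actually contains the sign change.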

Part (a) of the last theorem implies the following corollary which shows what happens to the 
average weight spectrum of the ensemble for long local codes. 

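Corollary 6. Let $d_1 = \delta_1 n$. Then

$$\frac{1}{N} \log \mathsf{E} B_{\omega N}(C) \le \frac{t \omega h(\delta_1)}{\delta_1} - (t - 1) h(\omega) + \gamma,$$

where $\gamma \le (\log N)/m + (\log n)/n$.
Proof: In (9) let us bound above $h(\cdot)$ by $\log n$. Then

$$\frac{1}{N} \log |U(w, d_1)|^t \le \frac{t}{n} \max \sum_{\ell=d_1}^{n} \nu_\ell \log \binom{n}{\ell} + \gamma.$$

Computing the maximum amounts to solving a linear programming problem whose dual is

$$\omega n \min z \quad \text{subject to} \quad \ell z \ge \log \binom{n}{\ell}, \quad \ell = d_1, d_1 + 1, \dots, n; \quad z \ge 0.$$

Its solution is given by $z^* = \omega n \max_{d_1 \le \ell \le n} \log \binom{n}{\ell} / \ell$. We obtain

$$\frac{1}{N} \log |U(w, d_1)|^t \le t \omega \max_{d_1 \le \ell \le n} \frac{1}{\ell} \log \binom{n}{\ell} + \gamma \le t \omega h(\delta_1)/\delta_1 + \gamma.$$

Employing (8) now completes the proof. $\square$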
II-C. Ensemble $\mathcal{C}_3(t, H)$.

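Theorem 7. Assume that $H$ is $\epsilon$-homogeneous. For $m \to \infty$ the average weight distribution over the
ensemble of linear codes $\mathcal{C}_3(t, H)$ satisfies $\mathsf{E} B_{\omega N} \le 2^{N(F + \gamma)}$, where

$$F = -x_0(1 - R) + x_0^t\, h(\omega/x_0^t) \quad \text{if } x_0 < 1,$$

$$F = h(\omega) + R - 1 \quad \text{if } x_0 \ge 1,$$

where $x_0$ is the unique positive root of the equation

$$t x^{t-1} \log\big(x^t/(x^t - \omega)\big) = 1 - R \tag{10}$$

and $\gamma = t(n + \log m)/N + \epsilon$.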

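Proof: Let $C \in \mathcal{C}_3(t, H)$ and let $x \in \{0, 1\}^N$ be a nonzero vector. Denote by $B_i$ the set of nonzero
vertices of $x$ in the part $V_i$, $i = 1, \dots, t$. Let $E = |E(B_1, B_2, \dots, B_t)|$. Let $k_i = |B_i|$, $\beta_i = k_i/m$; then the
probability that $x \in C$ equals $2^{-(1 - R_1) n \sum_i k_i}$. Assume w.l.o.g. that $\beta_1 \le \beta_2 \le \dots \le \beta_t$. The average
number of vectors of weight $w = \omega N$ in the code $C$ can be bounded above as

$$\mathsf{E} B_w \le \sum_{k_1, \dots, k_t} \Big[\prod_{i=1}^{t} \binom{m}{k_i}\Big] \binom{N(\prod_i \beta_i + \epsilon)}{w}\, 2^{-(1 - R_1) N \sum_i \beta_i}.$$

Then

$$\frac{1}{N} \log \mathsf{E} B_{\omega N} \le \max_{\substack{\omega \le \prod_i \beta_i \\ 0 \le \beta_i \le 1}} \Big\{ \Big(\prod_i \beta_i\Big) h\Big(\frac{\omega}{\prod_i \beta_i}\Big) - (1 - R_1) \sum_i \beta_i \Big\} + \gamma.$$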
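Let $\varphi(\beta_1, \dots, \beta_t)$ be the function in the brackets in the last expression. Let us prove that $\varphi$ is
concave in the domain $\mathcal{D} = [0, 1]^t \cap \{(\beta_1, \dots, \beta_t) : \prod_i \beta_i \ge \omega\}$. Computing its Hessian matrix,
we obtain

$$H_\varphi = -\log e \begin{bmatrix} s_1/\beta_1^2 & s_2/(\beta_1 \beta_2) & \cdots & s_2/(\beta_1 \beta_t) \\ s_2/(\beta_2 \beta_1) & s_1/\beta_2^2 & \cdots & s_2/(\beta_2 \beta_t) \\ \vdots & \vdots & \ddots & \vdots \\ s_2/(\beta_t \beta_1) & s_2/(\beta_t \beta_2) & \cdots & s_1/\beta_t^2 \end{bmatrix},$$

where

$$s_1 = \frac{\omega \prod_i \beta_i}{\prod_i \beta_i - \omega}, \qquad s_2 = s_1 + \prod_i \beta_i \ln\Big(1 - \frac{\omega}{\prod_i \beta_i}\Big).$$

The matrix $H_\varphi$ can be written as

$$H_\varphi = -\log e \big( s_2 z z^T + (s_1 - s_2) \operatorname{diag}(\beta_1^{-2}, \dots, \beta_t^{-2}) \big),$$

where $z = (1/\beta_1, \dots, 1/\beta_t)^T$ and $\operatorname{diag}(\cdot)$ denotes a diagonal matrix. We wish to prove that $H_\varphi$ is
negative definite for $\beta_i > 0$, $0 < \omega < \prod_i \beta_i$. Clearly, $s_1 > s_2$, and therefore the claim will follow
if we show that $s_2 > 0$. This is indeed true because letting $Q = \prod_i \beta_i$ and using the inequality
$x > \ln(1 + x)$ valid for $x > -1$, $x \ne 0$, we have

$$s_2 = \frac{\omega Q}{Q - \omega} - Q \ln\Big(1 + \frac{\omega}{Q - \omega}\Big) > \frac{\omega Q}{Q - \omega} - \frac{\omega Q}{Q - \omega} = 0.$$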
We will now show that the maximum of $\varphi$ in $\mathcal{D}$ is attained on the line $\ell$ given by $\beta_1 = \beta_2 =
\dots = \beta_t$. Note that $\mathcal{D}$ is an intersection of convex domains and therefore itself convex. Moreover,
the domain $\mathcal{D}$ is also symmetric in the sense that together with any point $p = (\beta_1, \dots, \beta_t)$ it also
contains all the points obtained from $p$ by permuting its coordinates, and the value of $\varphi$ at each of
these points is the same and equal to $\varphi(p)$. Because $\varphi$ is strictly concave, for any point $p \in \mathcal{D}$, $p \notin \ell$,
it is possible to find a point $q$ such that $\varphi(q) > \varphi(p)$ (any point $q$ on the segment between $p$ and
one of its symmetric points will do). This shows that the global maximum of $\varphi$ in $\mathcal{D}$ is attained on
$\ell$, including possibly the point $\beta_1 = \dots = \beta_t = 1$. Thus, we obtain

$$\frac{1}{N} \log \mathsf{E} B_{\omega N} \le \max_{\omega^{1/t} \le x \le 1} \big\{ -(1 - R)x + x^t h(\omega/x^t) \big\} + \gamma.$$

The maximum of this expression on $x$ is attained for $x$ determined from (10). This equation has a
unique positive root $x_0$ because the left-hand side is a falling function of $x$ that takes all positive
values for $x \in (\omega^{1/t}, \infty)$. This concludes the proof. $\square$

This theorem implies the following result. 

Corollary 8. For all values of the code rate satisfying $R \ge \log\big(2(1 - \delta_{GV}(R))^t\big)$, almost all codes in the
ensemble $\mathcal{C}_3(t)$ approach the GV bound as $N \to \infty$.

Proof: From the previous theorem, the GV bound is met for the first time when $x_0$ becomes 1.
Substituting 1 in (10), we obtain a condition on $\omega$ in the form $\omega = 1 - 2^{(R-1)/t}$. As long as this
value is less than $\delta_{GV}(R)$, the ensemble-average relative distance approaches $\delta_{GV}(R)$ as $N \to \infty$. $\square$

We note that the condition for the attainment of the GV bound turns out to be the same as for
the ensemble $\mathcal{C}_1(t)$ constructed from random graphs. The $\epsilon$-homogeneity condition, and in
particular, the expander mixing lemma for bipartite graphs are known to approximate the behavior of
random graphs. This approximation turns out to be good enough to ensure that both ensembles
contain GV codes in the same interval of code rates. Moreover, for small weights the average
number of codewords for the ensemble $\mathcal{C}_3(t, H)$ turns out to be smaller than for the ensemble $\mathcal{C}_1(t)$.
This is illustrated in two examples in Fig. 1.

For $t = 2$ codes in the ensembles $\mathcal{C}_3$ and $\mathcal{C}_1$ reach the GV bound for code rates $R \le 0.202$. For
$R > 0.202$ the codes are still asymptotically good, although slightly below the GV bound. For
these values of the rate, the average relative distance for the ensemble $\mathcal{C}_3$ is greater than for the
ensemble $\mathcal{C}_1$ as shown by the following numerical examples.

    R                        0.3       0.5       0.7       0.9
    $\mathcal{C}_1(2)$       0.18558   0.09276   0.03211   0.00337
    $\mathcal{C}_3(2, H)$    0.18605   0.09492   0.03242   0.00380



Similar relations between the weight spectra and distances of the ensembles $\mathcal{C}_1(t)$, $\mathcal{C}_3(t, H)$ hold
also for larger values of $t$.
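The entries of the table above can be recomputed from the exponents that appear in the proofs of Theorems 2 and 7. The sketch below is an illustration under stated assumptions: the vanishing terms $\gamma$ are dropped, the maximum over $x$ is found by ternary search (justified by concavity), and the distance estimate is taken to be the weight at which the exponent changes sign. All function names are hypothetical.

```python
import math

def h(p):
    """Binary entropy in bits; 0 outside the open interval (0, 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def maximize(f, lo, hi, iters=200):
    """Ternary search for the maximum of a concave function on [lo, hi]."""
    for _ in range(iters):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if f(a) < f(b):
            lo = a
        else:
            hi = b
    return f((lo + hi) / 2)

def exponent_c1(omega, R, t):
    """Random hypergraph, random local codes (proof of Theorem 2)."""
    return maximize(lambda x: x * (R - 1 + t * h(omega / x)), omega, 1.0) - (t - 1) * h(omega)

def exponent_c3(omega, R, t):
    """Fixed homogeneous hypergraph, random local codes (proof of Theorem 7)."""
    return maximize(lambda x: -(1 - R) * x + x ** t * h(omega / x ** t),
                    omega ** (1 / t), 1.0)

def first_zero(F, lo=1e-9, hi=0.5, iters=100):
    """Bisection for the sign change of F, assuming F(lo) < 0 < F(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if F(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def distance_c1(R, t=2):
    return first_zero(lambda w: exponent_c1(w, R, t))

def distance_c3(R, t=2):
    return first_zero(lambda w: exponent_c3(w, R, t))

for R in (0.3, 0.5):
    print(R, round(distance_c1(R), 5), round(distance_c3(R), 5))
```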
Figure 1. Average weight spectra for ensembles of graph codes: (I) $t = 2$, $R = 0.2$;
(II) $t = 3$, $R = 0.4$; (a) ensemble $\mathcal{C}_3(2, H)$, (b) ensemble $\mathcal{C}_1(2)$, (c) ensemble of
random linear codes.



III. Decoding 



For the case of a code $C(G, A)$ on a bipartite graph $G$, decoding can be performed by a natural
algorithm [6] that alternates between parallel decoding of local codes in the parts $V_1$ and $V_2$ until,
hopefully, it converges to a fixed point. In this algorithm, the most current value of each edge (bit)
is stored at the vertex in the part decoded in the most recent iteration. However, pursuing such an
edge-oriented procedure is difficult for $t > 2$. In [2] the following alternative is suggested: starting
from the values of the bits stored on the edges of $H$, decode in parallel all local codes in all parts of
$H$ and for each $v \in V$ form an independent decision about the codeword of $A$ that corresponds to
the edges $E(v)$. Next, the values of the bits at every vertex are updated, so that now every vertex
stores an independent opinion of its bits' values. For the update, the value of the bit $x_e(v)$ is set
to the majority value of the decoded versions of this bit at all the vertices $v' \in e \setminus v$, where $e \ni v$ is
an edge (for this to be well-defined, the values of $t$ are assumed to be even). The decoding then
iterates, repeating this parallel decoding round until all the vertices agree on all bits.

In [2] this algorithm is shown to correct all patterns of errors provided that their proportion, as
a fraction of the blocklength $N$, is less than

$$\binom{t-1}{t/2}^{-2/t} \Big(\frac{\delta_1}{2}\Big)^{(t+2)/t} - c_2(\epsilon, \delta_1, t), \tag{11}$$

where $c_2(\epsilon, \delta_1, t) \to 0$ as $\epsilon \to 0$. This algorithm consists of $\log N$ iterations, each of which has serial
running time linear in the blocklength $N$. Its analysis relies on the $\epsilon$-homogeneity of $H$.

For fixed values of $t > 2$, if one thinks of $\delta_1$ as a variable quantity, then the number of correctable
errors in (11) is not a constant fraction of the designed distance (4). For example, for $t = 4$, (11)
gives a decoding radius equal to $N$ times the fraction

$$\frac{\delta_1^{3/2}}{2\sqrt{6}}.$$

For small $\delta_1$ this is a much smaller quantity than the designed distance $\delta_1^{4/3} N$. This consideration is
reinforced by the fact that advantages of hypergraph codes are most pronounced for small values
of the distance $\delta$.

Our objective is to propose an alternative decoding strategy that decodes a constant fraction of
the designed distance.



For every $i$, we shall define an $i$-th subprocedure that decodes the subcode $A$ on every vertex
belonging to the vertex set $V_i$. We shall claim that if the initial number of errors is less than a
bound that we shall introduce, then for at least one $i$, the $i$-th subprocedure applied to the initial
error pattern produces a pattern with a smaller number of errors.

Let us now describe the decoding procedure in more detail. For every vertex $v$, and the
associated subspace $\{0, 1\}^n$ where coordinates are indexed by the edges incident to $v$, we will use the
following threshold decoding procedure $T_\kappa$ of the constituent code $A$. This means that we introduce
a number $\kappa \ge 2$, to be optimized later, and that we decode a vertex subcode only if its Hamming
distance to the closest codeword is less than or equal to $\theta = d_1/\kappa$. If every codeword of $A$ is at distance
more than $d_1/\kappa$ we leave the subvector untouched. Let $V_i = (v_{i,1}, \dots, v_{i,m})$ be the $i$th component
of $H$. Given an $N$-vector $z = (z(v_{i,1}), \dots, z(v_{i,m}))$, we can decode each of its $m$ subvectors
with $T_\kappa$, obtaining an $N$-vector $w$. Abusing notation, we will write $w = T_\kappa(z)$. The $i$-th subprocedure
now consists of applying $T_\kappa$ to the component $V_i$.
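The threshold decoder at a single vertex can be sketched as follows. The local code used here, the $[8, 4, 4]$ first-order Reed-Muller (extended Hamming) code, the choice $\kappa = 4$ and the brute-force nearest-codeword search are illustrative assumptions and are not taken from the paper.

```python
from itertools import product

# Generator matrix of the [8, 4, 4] first-order Reed-Muller code (illustrative local code A).
G = [(1, 1, 1, 1, 1, 1, 1, 1),
     (0, 0, 0, 0, 1, 1, 1, 1),
     (0, 0, 1, 1, 0, 0, 1, 1),
     (0, 1, 0, 1, 0, 1, 0, 1)]

def span(gen):
    """All codewords generated by the rows of `gen` over GF(2)."""
    n = len(gen[0])
    return sorted({tuple(sum(c * g[j] for c, g in zip(coeffs, gen)) % 2 for j in range(n))
                   for coeffs in product((0, 1), repeat=len(gen))})

CODE = span(G)
D1 = min(sum(c) for c in CODE if any(c))        # minimum distance d_1 of A

def threshold_decode(z, kappa):
    """T_kappa at one vertex: decode only if the nearest codeword is within d_1/kappa."""
    dist = lambda c: sum(a != b for a, b in zip(c, z))
    best = min(CODE, key=dist)
    return best if dist(best) <= D1 / kappa else tuple(z)

print(threshold_decode((1, 0, 0, 0, 0, 0, 0, 0), 4))   # one error: corrected
print(threshold_decode((1, 1, 0, 0, 0, 0, 0, 0), 4))   # two errors: left untouched
print(threshold_decode((1, 1, 1, 0, 0, 0, 0, 0), 4))   # three errors: decoded to a wrong codeword
```

With the all-zero codeword transmitted, the three inputs illustrate the three possible outcomes at a vertex: correct decoding, no action, and decoding to a wrong codeword.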

As mentioned above, we shall claim that one among the $t$ subprocedures lowers the
total number of errors. However, the decoding algorithm will not be able to discern which of the
$i$-th subprocedures is successful. So the decoder will apply all $t$ subprocedures in parallel to the
received vector, yielding $t$ output vectors. The next decoding iteration will have to be applied to
every output of the preceding iteration, so that $s$ iterations of the algorithm will yield $t^s$ output
vectors. We will only apply the algorithm for a constant number of iterations, however, until we
are guaranteed that the number of remaining errors for at least one of the $t^s$ outputs has fallen
below the error-correcting capability of Bilu and Hoory's decoding procedure. We then let the
latter decoder take over and decode all $t^s$ candidates. At least one of them is guaranteed to be the
closest codeword, and it can be singled out simply by computing the Hamming distance of every
candidate to the initial received vector.

To give a more formal description of the algorithm, suppose that $y \in \{0,1\}^N$ is the vector
received from the channel. In each iteration the processing is done in parallel in all the vertices of
$H$. Let $\mathcal{Y}_i^{(j)} = \{y_{i,l}^{(j)}\}$ be the set of $N$-vectors stored at the vertices of the component $V_i$ before the $j$th
iteration. By the discussion above, $|\mathcal{Y}_i^{(j)}| \le t^{j-1}$.

We begin by setting $\mathcal{Y}_i^{(1)} = \{y\}$ for all $i$. Iteration $j = 1, 2, \dots, s$ consists of running $t$ parallel
subprocedures. The $i$th subprocedure applies decoder $T_\kappa$ to every vector in the set $\mathcal{Y}_i^{(j)}$, replacing
it with the vector $T_\kappa(y_{i,l}^{(j)})$, $l = 1, \dots, |\mathcal{Y}_i^{(j)}|$. The outcome of this step creates $t$ potentially
different decodings of every vector $y \in \mathcal{Y}_i^{(j)}$, $i = 1, \dots, t$. In the second part of the iteration we
form the sets $\mathcal{Y}_i^{(j+1)}$, $i = 1, \dots, t$, by replacing each vector $y \in \mathcal{Y}_i^{(j)}$ with its decodings obtained in
all the $t$ subprocedures.
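The branching structure of the iterations can be sketched in a few lines. The sketch below makes several simplifying assumptions that are not part of the description above: it keeps one shared candidate set instead of one set per component, it decodes the local code by exhaustive search, and it replaces the final hand-off to Bilu and Hoory's decoder by a plain codeword check. All function names and the toy $3 \times 3$ instance (rows and columns as the two components, a length-3 repetition code as the local code) are hypothetical.

```python
from typing import List, Optional, Sequence, Set, Tuple

Word = Tuple[int, ...]
Blocks = List[List[int]]  # one component V_i: m blocks of n coordinate indices


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    return sum(x != y for x, y in zip(a, b))


def threshold_decode(sub: Word, codebook: Sequence[Word], theta: float) -> Word:
    """Nearest codeword if within distance theta, else the input unchanged."""
    best = min(codebook, key=lambda c: hamming(sub, c))
    return best if hamming(sub, best) <= theta else sub


def subprocedure(z: Word, blocks: Blocks, codebook: Sequence[Word],
                 theta: float) -> Word:
    """Apply the threshold decoder to every vertex of one component."""
    w = list(z)
    for block in blocks:
        sub = tuple(z[e] for e in block)
        for e, bit in zip(block, threshold_decode(sub, codebook, theta)):
            w[e] = bit
    return tuple(w)


def branch(y: Word, components: List[Blocks], codebook: Sequence[Word],
           theta: float, s: int) -> Set[Word]:
    """Run s iterations; each candidate spawns one output per component."""
    candidates = {y}
    for _ in range(s):
        candidates = {subprocedure(z, blocks, codebook, theta)
                      for z in candidates for blocks in components}
    return candidates  # at most t**s distinct vectors


def is_codeword(z: Word, components: List[Blocks],
                codebook: Sequence[Word]) -> bool:
    code = set(codebook)
    return all(tuple(z[e] for e in block) in code
               for blocks in components for block in blocks)


def select(y: Word, candidates: Set[Word], components: List[Blocks],
           codebook: Sequence[Word]) -> Optional[Word]:
    """Stand-in for the final stage: keep codewords, pick the one closest to y."""
    valid = [c for c in candidates if is_codeword(c, components, codebook)]
    return min(valid, key=lambda c: hamming(y, c)) if valid else None


# Toy instance (an assumption): t = 2, N = 9, local repetition code, d1 = 3.
rows = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
cols = [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
A = [(0, 0, 0), (1, 1, 1)]
theta = 3 / 2                      # d1 / kappa with kappa = 2
y = (1, 1, 0, 0, 0, 0, 0, 0, 0)    # all-zero codeword with two errors

cands = branch(y, [rows, cols], A, theta, s=1)
decoded = select(y, cands, [rows, cols], A)
print(sorted(cands), decoded)
```

In this toy run the row subprocedure miscorrects the first row and adds an error, while the column subprocedure removes both errors, which is why both outputs have to be kept.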

Next, we prove that one of the $t$ subprocedures will actually diminish the number of errors.
This analysis also relies on $\epsilon$-homogeneity, although in a way different from [2]. Let $\mathcal{E}$ be the set
of coordinates, i.e. the set of edges, that are in error. For every $i = 1, \dots, t$ let us partition the set of
vertices in $V_i$ that are incident to $\mathcal{E}$ into three subsets, $G_i$, $N_i$, $B_i$. The set $G_i$ is the subset of vertices
that will be correctly decoded, $N_i$ is the subset of vertices that are left untouched by the threshold
decoder, and $B_i$ is the set of those vertices that are wrongly decoded to a parasite codeword of $A$.
The situation is summarized in Figure 2. From now on by the $\mathcal{E}$-degree of a vertex we shall mean
the degree of this vertex in the subhypergraph induced by the edge set $\mathcal{E}$. It should be clear that
every vertex of $G_i$ has $\mathcal{E}$-degree not more than $d_1/\kappa$, every vertex in $N_i$ has $\mathcal{E}$-degree at least $d_1/\kappa$,
and every vertex in $B_i$ has $\mathcal{E}$-degree at least $(\kappa - 1)d_1/\kappa$.

We use the shorthand notation $\mathcal{E}(G_i)$ to mean the set of edges that has one of its endpoints in
$G_i$. Similarly we shall write $\mathcal{E}(N_i)$ and $\mathcal{E}(B_i)$.

[Figure 2: the vertices incident to edges in error, split into three groups: (bad) vertices in error
that will be badly decoded; (neutral) vertices in error that are left untouched; (good) vertices in
error that will be correctly decoded.]

Figure 2. Details of the set of vertices incident to edges in error. The max $\mathcal{E}$-degree
in $G_i$ is less than $d_1/\kappa$, the min $\mathcal{E}$-degree in $B_i$ is at least $(\kappa - 1)d_1/\kappa$, the min
$\mathcal{E}$-degree in $N_i$ is at least $d_1/\kappa$.

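Lemma 9. If the $i$-th decoding subprocedure introduces more errors than it removes, then $|\mathcal{E}(G_i)| \le |\mathcal{E}|/\kappa$.
Moreover, if

$$\mu_i = \frac{|\mathcal{E}(N_i)|}{|\mathcal{E}(N_i) \cup \mathcal{E}(B_i)|}$$

then

$$|\mathcal{E}(G_i)| \le \frac{1 - \mu_i}{\kappa - \mu_i}\,|\mathcal{E}|.$$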

Proof: The first part of the lemma follows from the second part, which is proved as follows. We
bound from above $|\mathcal{E}(G_i)|$, the number of edges removed, by the number of edges added, which is at most $|B_i|\,d_1/\kappa$: we get

$$|\mathcal{E}(G_i)| \le |B_i|\,\frac{d_1}{\kappa} = |B_i|\,d_1\Bigl(1 - \frac{1}{\kappa}\Bigr)\frac{1}{\kappa - 1} \le \frac{|\mathcal{E}(B_i)|}{\kappa - 1}.$$

The first inequality comes from the definition of $\kappa$ and the threshold decoder. The second inequality
states that $(1 - 1/\kappa)d_1$ is a lower bound on the minimum $\mathcal{E}$-degree in $B_i$. We now have

$$|\mathcal{E}| = |\mathcal{E}(G_i)| + |\mathcal{E}(N_i)| + |\mathcal{E}(B_i)| = |\mathcal{E}(G_i)| + \frac{|\mathcal{E}(B_i)|}{1 - \mu_i}, \tag{12}$$

which proves the lemma.
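The passage from (12) to the second claim of the lemma is a short rearrangement. Spelled out, and assuming as (12) does that $\mathcal{E}(N_i)$ and $\mathcal{E}(B_i)$ are disjoint, it reads:

```latex
\begin{align*}
|\mathcal{E}(B_i)| &= (1-\mu_i)\bigl(|\mathcal{E}| - |\mathcal{E}(G_i)|\bigr)
  && \text{by (12)},\\
(\kappa-1)\,|\mathcal{E}(G_i)| &\le (1-\mu_i)\bigl(|\mathcal{E}| - |\mathcal{E}(G_i)|\bigr)
  && \text{since } |\mathcal{E}(G_i)| \le |\mathcal{E}(B_i)|/(\kappa-1),\\
(\kappa-\mu_i)\,|\mathcal{E}(G_i)| &\le (1-\mu_i)\,|\mathcal{E}|,\\
|\mathcal{E}(G_i)| &\le \frac{1-\mu_i}{\kappa-\mu_i}\,|\mathcal{E}|.
\end{align*}
```

Since $(1-\mu_i)/(\kappa-\mu_i) \le 1/\kappa$ whenever $0 \le \mu_i \le 1$ and $\kappa \ge 1$, the first claim of the lemma is the worst case $\mu_i = 0$ of the second.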



Theorem 10. For any $\alpha > 0$, if the number of errors $\varepsilon N$ is such that

$$\varepsilon \le (1 - \alpha)\,\frac{\delta_1^{t/(t-1)}}{(t+1)^{(t+1)/(t-1)}}, \tag{13}$$

they can be corrected in time $O(N \log N)$.

Proof: The theorem will follow if we show that at least one subprocedure reduces the error count
by a constant fraction. Indeed, in this case a constant number of rounds of the above algorithm
will reduce the error count to any positive proportion of the designed distance, whereupon the
remaining errors will be removed in $O(\log N)$ steps of Bilu-Hoory's algorithm.

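Assume toward a contradiction that all the decoding subprocedures, $i = 1, \dots, t$, introduce
more errors than they remove. Let us introduce the following notation: $|\mathcal{E}| = \varepsilon N$, $S_i = B_i \cup N_i$,
$|S_i| = \sigma_i m$. Note that since the minimum $\mathcal{E}$-degree in $S_i$ is at least $d_1/\kappa$, we have $|S_i|\,d_1/\kappa \le |\mathcal{E}|$, that is,

$$\sigma_i \le \kappa\varepsilon/\delta_1. \tag{14}$$

Consider the subset of edges obtained from $\mathcal{E}$ by removing all edges incident to "good" vertices
$G_i$ for all $i$. We are left with a subhypergraph $H_{\mathcal{E}}$ with vertex sets $S_i$, $i = 1, \dots, t$. Use Lemma 9 (the
first part) for all $i$ to argue that the total fraction of edges in $H_{\mathcal{E}}$ is at least $\varepsilon(1 - t/\kappa)$. Applying the
$\epsilon$-homogeneous property gives

$$\varepsilon\Bigl(1 - \frac{t}{\kappa}\Bigr) \le \sigma_1 \cdots \sigma_t + \epsilon \min_{i \ne j}(\sigma_i\sigma_j)^{1/2}.$$

Applying (14) we obtain

$$\varepsilon\Bigl(1 - \frac{t}{\kappa}\Bigr) \le \Bigl(\frac{\kappa\varepsilon}{\delta_1}\Bigr)^{t} + \epsilon\,\frac{\kappa\varepsilon}{\delta_1}.$$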

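This inequality does not hold (and therefore our assumption is false) if

$$\varepsilon < \delta_1^{t/(t-1)}\Bigl[\frac{1}{\kappa^{t}}\Bigl(1 - \frac{t}{\kappa} - \frac{\epsilon\kappa}{\delta_1}\Bigr)\Bigr]^{1/(t-1)}. \tag{15}$$

Taking $\kappa = t + 1$, rewrite the bracketed factor on the right, together with its exponent, as

$$\Bigl(\frac{1}{t+1}\Bigr)^{(t+1)/(t-1)}\Bigl(1 - \frac{(t+1)^2\epsilon}{\delta_1}\Bigr)^{1/(t-1)}.$$

By taking sufficiently large $n$ it is possible to make $\epsilon$ small enough so that for any given $\alpha' > 0$
there holds

$$\bigl(1 - (t+1)^2\epsilon/\delta_1\bigr)^{1/(t-1)} > 1 - \alpha'.$$

This means that (15) is satisfied for all

$$\varepsilon < (1 - \alpha')\,\frac{\delta_1^{t/(t-1)}}{(t+1)^{(t+1)/(t-1)}}.$$

Finally, choosing $\alpha' < \alpha$ guarantees that at least one subprocedure reduces the error count by a
constant fraction. □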

We see that the upper bound on the number of correctable errors given by Theorem 10 is a constant
proportion $\gamma$ of the designed distance $\delta N$, where $\gamma = 1/(t+1)^{(t+1)/(t-1)}$. For example,
for $t = 3, 4$ we get $\gamma = 1/16$ and $1/14.6$, respectively.
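The constant is easy to evaluate for other values of $t$. The following sketch simply evaluates the closed form $\gamma = (t+1)^{-(t+1)/(t-1)}$; the function name `gamma` is a hypothetical helper introduced here.

```python
def gamma(t: int) -> float:
    """Fraction of the designed distance guaranteed with kappa = t + 1."""
    return (t + 1) ** (-(t + 1) / (t - 1))


for t in (3, 4, 5, 6):
    print(f"t = {t}: gamma = 1/{1 / gamma(t):.2f}")
```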

The next theorem provides a better estimate of $\gamma$ by refining the above analysis. The way this is
done is to rely on the full power of Lemma 9 instead of its first part as above.

Theorem 11. For any $\alpha > 0$, if the number of errors $\varepsilon N$ is such that

$$\varepsilon \le (1 - \alpha)\,\delta_1^{t/(t-1)} \max_{\kappa}\,\min_{\mu} f(\mu, \kappa)$$

with

$$f(\mu, \kappa) = \frac{\bigl[1 - t(1 - \mu)/(\kappa - \mu)\bigr]^{1/(t-1)}}{\kappa^{t/(t-1)}\bigl[\mu + (1 - \mu)/(\kappa - 1)\bigr]^{t/(t-1)}},$$

they can be corrected in time $O(N \log N)$.

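Proof: We proceed as in the previous theorem, assuming toward a contradiction that each subprocedure
increases the error count. Using the definition of $\mu_i$ given above,

$$|\mathcal{E}(S_i)| = \frac{|\mathcal{E}(B_i)|}{1 - \mu_i} = \frac{|\mathcal{E}(N_i)|}{\mu_i}.$$

Recall that the subhypergraph $H_{\mathcal{E}}$ is formed of the edges all of whose vertices are in the sets $S_i$. To count
the total fraction of edges $\mu(H_{\mathcal{E}})$ in the subhypergraph we employ Lemma 9:

$$\mu(H_{\mathcal{E}}) \ge \varepsilon\Bigl(1 - \sum_{i=1}^{t}\frac{1 - \mu_i}{\kappa - \mu_i}\Bigr).$$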



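The $\mathcal{E}$-degree of a vertex in $S_i$ (resp., $B_i$) is at least $d_1/\kappa$ (resp., $d_1(\kappa - 1)/\kappa$). Hence

$$|S_i| = |N_i| + |B_i| \le \frac{\kappa\,|\mathcal{E}(N_i)|}{d_1} + \frac{\kappa\,|\mathcal{E}(B_i)|}{d_1(\kappa - 1)},$$

so that

$$\sigma_i \le \frac{\kappa\varepsilon}{\delta_1}\Bigl(\mu_i + \frac{1 - \mu_i}{\kappa - 1}\Bigr).$$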



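Using the last two inequalities in the $\epsilon$-homogeneous property, we obtain

$$\varepsilon\Bigl(1 - \sum_{i=1}^{t}\frac{1 - \mu_i}{\kappa - \mu_i}\Bigr) \le \Bigl(\frac{\kappa\varepsilon}{\delta_1}\Bigr)^{t}\prod_{i=1}^{t}\Bigl(\mu_i + \frac{1 - \mu_i}{\kappa - 1}\Bigr) + \epsilon\,\frac{\kappa\varepsilon}{\delta_1}.$$

To contradict this, let

$$\varepsilon < \delta_1^{t/(t-1)}\Biggl[\frac{1 - \sum_{i}(1 - \mu_i)/(\kappa - \mu_i) - \epsilon\kappa/\delta_1}{\kappa^{t}\prod_{i}\bigl(\mu_i + (1 - \mu_i)/(\kappa - 1)\bigr)}\Biggr]^{1/(t-1)}.$$

We again bound the terms that involve $\epsilon$ from below by a multiplicative term $1 - \alpha'$. Optimizing
on all possible values of $\mu_i$ gives $\mu_i = \mu$ for all $i = 1, \dots, t$, whereupon the expression on the right
can be replaced by $(1 - \alpha)\,\delta_1^{t/(t-1)} f(\mu, \kappa)$. The proof is thus complete. □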

Numerically, the first values of the decoding radius $\rho$ given by Theorem 11 are

$$\rho \ge \frac{\delta}{5.94} \ \text{ for } t = 3, \qquad \rho \ge \frac{\delta}{6.46} \ \text{ for } t = 4,$$

attained for $\kappa$ satisfying $(\kappa - 1)^{-t} = 1 - t/\kappa$ and $\mu = 0$ or $1$.
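These constants can be checked numerically. The sketch below assumes that $f(\mu, \kappa) = [1 - t(1-\mu)/(\kappa-\mu)]^{1/(t-1)} / [\kappa(\mu + (1-\mu)/(\kappa-1))]^{t/(t-1)}$ and that the optimal $\kappa$ balances the two endpoints $\mu = 0$ and $\mu = 1$, i.e. solves $(\kappa-1)^t(1 - t/\kappa) = 1$. It uses plain bisection on the interval $(t, t+1)$; the function names are hypothetical. Under these assumptions the resulting constants come out close to the values quoted above.

```python
def f(mu: float, kappa: float, t: int) -> float:
    """Assumed form of f(mu, kappa) in Theorem 11."""
    num = (1 - t * (1 - mu) / (kappa - mu)) ** (1 / (t - 1))
    den = (kappa * (mu + (1 - mu) / (kappa - 1))) ** (t / (t - 1))
    return num / den


def balanced_kappa(t: int, iters: int = 100) -> float:
    """Bisection root of (kappa - 1)^t (1 - t/kappa) = 1 on (t, t + 1)."""
    def g(k: float) -> float:
        return (k - 1) ** t * (1 - t / k) - 1

    lo, hi = float(t), float(t + 1)  # g(lo) = -1 < 0 < g(hi)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for t in (3, 4):
    k = balanced_kappa(t)
    # At the balance point f(0, k) = f(1, k) = k ** (-t / (t - 1)).
    print(f"t = {t}: kappa = {k:.4f}, rho >= delta / {1 / f(1.0, k, t):.3f}")
```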

Can one obtain better bounds for the decoding radius? In principle, it is possible to obtain
further improvements by introducing multiple thresholds instead of the single decoding threshold
$\theta = d_1/\kappa$, and approach $\rho = \delta/2$ by increasing their number. However, we shall only be able
to claim that using one of the multiple thresholds reduces the number of errors for one of the
subprocedures, but we shall not be able to discern which decoding threshold achieves that. This
will result in yet another layer of parallelism, further increasing the value of the constant in the
decoding complexity. We will not pursue this line of research further here. A remaining challenge
is to decode up to half the designed distance with an iterative decoding procedure of reasonable
complexity.